Text File | 1990-02-16 | 9.0 KB | 154 lines | [TEXT/GEOL] |
- Item forwarded by A33 to A34
-
- Item 5735846 12-Feb-90 19:31PST
-
- From: D4384 US Voting Mach, Sarner, Calvin,PRT
-
- To: MACAPP.TECH$ MacApp Technical
-
- cc: D5295 Research SW Design, D Goldman,PRT
-
- Sub: Persistent Guerillas Needed
-
- What follows is a summary of the persistent object problem in MacApp, with some
- possible solutions, and a proposal for guerilla tactics.
-
- Persistent objects may be minimally defined as objects whose state persists
- across invocations of an application. Traditionally, persistent data on the Mac
- lives in documents. But there is not yet any general purpose mechanism for
- treating collections of objects as documents. Such a mechanism would allow
- opening a document to restore all objects in the document to the state they had
- when the document was closed, that is, the state of being fully instantiated
- descendants of TObject. Ideally, the methods of the classes being instantiated
- would also be restored, but this is not possible without more compiler and
- linker support than MPW provides.
- Various solutions to the persistent object problem have been discussed here
- so far, and no doubt many others have been implemented:
- • Edmund noted that view resources are already a limited solution for
- descendants of TView.
- • Larry's Frameworks article presents a partial solution in the form of
- TStream, so long as there are not multiple references among persistent objects.
- Also, the entire stream must be read into memory.
- • Greg (me) presents a method (which Edmund happily calls virtual objects) for
- swapping persistent objects in and out of memory.
- • Someone (I've lost the link) describes keeping persistent objects on an
- Inside/Out database and bringing them into memory as needed.
-
- Each of these solutions is appropriate to its intended use, but all fall
- short of a general solution. A general solution would:
- 1) Allow a document to be treated as a collection of persistent descendants of
- TObject. No special code should be needed for object I/O.
- 2) Allow access to documents larger than available memory, even larger than
- virtual memory.
- 3) Allow persistent objects to refer to other persistent objects in arbitrary
- networks, including cycles.
- 4) Allow multiple users to have simultaneous access to the same collection of
- objects.
-
- Some ideas for achieving these goals follow.
-
- 1) Larry's use of metadata points the way here. The main criticism of his
- suggestion is that we must write code for each object to support I/O. I have
- not found this to be so bad, as each object can INHERIT the I/O methods of its
- ancestors and just add in code for its new fields. But there should be a better
- way. It seems one solution would be to use the Fields method of each object to
- drive a general I/O method, as the Fields method already provides information
- about the types and location of each field of an object. Of course we still
- need to write a Fields method for each object, but we are already supposed to
- be doing that (not that I do!). Ideally, the compiler would generate the Fields
- method for us.
- A second issue is how to save and restore the object metadata. I would
- suggest that we create a resource type, similar to a view resource, for the
- purpose of saving the class names and field types of persistent objects. Each
- object on the data fork could then have in its header the ID of its class name
- resource.
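-
- To make the Fields-driven idea concrete, here is a rough sketch in Python
- rather than Object Pascal. It is only an illustration under assumptions: a
- class-level FIELDS list stands in for the Fields method, CLASS_TABLE stands in
- for the proposed class name resource, and every name in it (Persistent,
- all_fields, write_object, read_object) is made up for this example.
-
```python
import io
import struct

class Persistent:
    # Hypothetical stand-in for a Fields method: (field name, struct format code).
    FIELDS = []

    @classmethod
    def all_fields(cls):
        # Walk ancestors first, so each subclass describes only its new fields.
        fields = []
        for klass in reversed(cls.__mro__):
            fields.extend(klass.__dict__.get("FIELDS", []))
        return fields

class Shape(Persistent):
    FIELDS = [("x", "h"), ("y", "h")]

class Circle(Shape):
    FIELDS = [("radius", "h")]

# Plays the role of the class name resource: class ID -> class.
CLASS_TABLE = {1: Shape, 2: Circle}
CLASS_IDS = {cls: cid for cid, cls in CLASS_TABLE.items()}

def write_object(stream, obj):
    # Header: the class ID. Body: each field, in the order all_fields reports.
    stream.write(struct.pack(">h", CLASS_IDS[type(obj)]))
    for name, fmt in type(obj).all_fields():
        stream.write(struct.pack(">" + fmt, getattr(obj, name)))

def read_object(stream):
    # The class ID in the header tells us which field list drives the read.
    (class_id,) = struct.unpack(">h", stream.read(2))
    cls = CLASS_TABLE[class_id]
    obj = cls.__new__(cls)
    for name, fmt in cls.all_fields():
        size = struct.calcsize(">" + fmt)
        (value,) = struct.unpack(">" + fmt, stream.read(size))
        setattr(obj, name, value)
    return obj
```
-
- Note that neither Shape nor Circle contains any I/O code of its own; the one
- general pair of routines is driven entirely by the field descriptions.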
-
- 2) Big documents present big problems. Virtual objects help, but require that
- at least a handle and a header remain in memory for each persistent object, and
- also require explicit locking of objects to be sure they are in memory before
- they are accessed. Even so, I think a general solution will include virtual
- objects. The idea is to keep a cache of the most active objects in memory, with
- less active objects being swapped out by writing their data to disk and
- resizing their handles down to just a small header, including the ClassID. This
- means that the methods of a virtual object can be dispatched whether or not the
- object is all in memory, but the data must be swapped in before it is
- referenced. Note that following the discipline of accessing object fields only
- through methods can largely eliminate concerns over when to swap, except for
- those methods themselves.
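-
- The caching behavior described above might look something like the following
- Python sketch. It is an assumption-laden toy, not the actual virtual object
- code: a dict stands in for the disk file, the least recently touched object is
- the one swapped out, and the names VirtualStore, VirtualObject, touch, get and
- set are invented here. Fields are reached only through methods, so the swap-in
- happens in one place.
-
```python
from collections import OrderedDict

class VirtualStore:
    """Keeps at most `capacity` objects' data in memory; the rest live on 'disk'."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.disk = {}                 # object ID -> saved fields (stand-in for a file)
        self.resident = OrderedDict()  # object ID -> object, least recently used first

    def touch(self, obj):
        """Make sure obj's data is in memory, evicting the least active object."""
        if obj.data is None:
            obj.data = self.disk.pop(obj.oid)    # swap in
        self.resident[obj.oid] = obj
        self.resident.move_to_end(obj.oid)       # mark as most recently used
        while len(self.resident) > self.capacity:
            _, victim = self.resident.popitem(last=False)
            self.disk[victim.oid] = victim.data  # swap out
            victim.data = None                   # only the small header remains

class VirtualObject:
    def __init__(self, store, oid, **fields):
        self.store = store      # header part: always in memory
        self.oid = oid          # header part: always in memory
        self.data = dict(fields)
        store.touch(self)

    def get(self, name):
        self.store.touch(self)  # data must be swapped in before it is referenced
        return self.data[name]

    def set(self, name, value):
        self.store.touch(self)
        self.data[name] = value
```
-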
- An alternative solution is to keep objects in persistent collections (such
- as a database) and check objects in and out of the database as needed. This is
- less flexible than virtual objects, but can allow for arbitrarily large
- collections of objects. The main restriction is that an object in a collection
- must be referenced only as a member of that collection, and can be a member of
- at most one collection at a time. Also, the database must provide support for
- heterogeneous collections for some typical uses of objects to be effective
- (such as walking a list of heterogeneous objects in a Draw method). Relational
- databases generally rule this out, since every row of a table shares one schema.
- An integrated solution would allow both scalar virtual objects and objects
- which are in fact collections of other objects. MacApp provides only the TList
- type, whereas Smalltalk and others provide a rich set of collection classes. At
- the least a flat file, heterogeneous list, and ordered list (BTree) class could
- be provided, with an interface modeled on TList.
-
- 3) References among objects are a pain, especially arbitrary networks of
- references. The view architecture finesses this one by requiring views to form
- a hierarchy that can be traversed from the root, so that references to
- superviews and subviews can be resolved with finite searches.
- A more general solution proposed by Larry is to give each persistent object
- an ID (which might be its offset in the document file), refer to objects by
- ID, and maintain a translation table of IDs to handles. Then each ID
- reference can be resolved at runtime into a handle to the object. With a
- virtual object scheme, this table might be created and used only while the
- document is being opened, and could then be disposed of.
- The main problem for IDs is unresolved references. When an object is first
- read in, the IDs it refers to might not yet be in the translation table. So we
- might have to keep a list of unresolved references as we first traverse the
- document, then make a pass of this list to finally resolve them.
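-
- A small Python sketch of the translation table with a list of unresolved
- references follows. The record layout (ID, name, list of referenced IDs) and
- the names Node and load are assumptions made for the example; Python object
- references play the part of handles.
-
```python
class Node:
    def __init__(self, name, ref_count):
        self.name = name
        self.refs = [None] * ref_count   # slots filled in as references resolve

def load(records):
    """records: (object ID, name, [referenced IDs]) tuples in file order."""
    table = {}        # translation table: ID -> in-memory object
    unresolved = []   # (object, slot index, referenced ID) seen before its target
    for oid, name, ref_ids in records:
        node = Node(name, len(ref_ids))
        table[oid] = node
        for slot, rid in enumerate(ref_ids):
            if rid in table:
                node.refs[slot] = table[rid]        # target already read
            else:
                unresolved.append((node, slot, rid))
    # Second pass: every object has now been read, so every ID is in the table.
    for node, slot, rid in unresolved:
        node.refs[slot] = table[rid]   # a KeyError here means a dangling reference
    return table
```
-
- Because forward references are simply deferred, cycles cause no trouble here.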
- If virtual object IDs are implemented as file offsets then they can be
- resolved as they are encountered. This works OK for trees and acyclic graphs,
- but could cause infinite regress in the presence of cycles. Perhaps one bit of
- the ID could serve to indicate whether a reference is a file offset or a
- resolved handle, and thus stop the regress.
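-
- One possible reading of the flag-bit idea, sketched in Python. Everything in
- it is hypothetical: OFFSET_FLAG, the Loader class, and the dict standing in for
- the document file. The extra `loaded` map is my own assumption, added so that
- two references to the same offset end up sharing one object. A reference is
- resolved only when it is first followed, so reading one object never drags in
- the objects it points to.
-
```python
OFFSET_FLAG = 1 << 31   # hypothetical: bit set means "still a file offset"

class Loader:
    def __init__(self, file_records):
        self.file = file_records  # offset -> (name, [referenced offsets]); stand-in for a file
        self.handles = []         # handle table: index -> loaded object
        self.loaded = {}          # offset -> handle index, for objects already read

    def load(self, offset):
        """Read the object at `offset`; its references stay as flagged offsets."""
        if offset in self.loaded:
            return self.loaded[offset]
        name, ref_offsets = self.file[offset]
        obj = {"name": name, "refs": [off | OFFSET_FLAG for off in ref_offsets]}
        self.handles.append(obj)
        self.loaded[offset] = len(self.handles) - 1
        return self.loaded[offset]

    def follow(self, obj, i):
        """Resolve reference i of obj on first use, then return the target."""
        ref = obj["refs"][i]
        if ref & OFFSET_FLAG:
            ref = self.load(ref & ~OFFSET_FLAG)
            obj["refs"][i] = ref   # now a resolved handle index; flag bit is clear
        return self.handles[ref]
```
-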
- The persistent collection approach allows references only to the values of
- objects, or their location in a collection, but never to their identity. This
- finesses the problem by simply not allowing direct references among objects.
-
- 4) Multi-user access presents all the problems of data integrity and deadlock
- of any distributed database. Implementing persistent collections on top of an
- existing database is by far the easiest solution.
- For virtual objects, the memory locking routines could be combined with a
- file range locking protocol to allow multi-user access. This would still
- require the programmer to prevent deadlock and maintain data integrity by
- careful design of access protocols (e.g. two-phase transaction locking). A
- really sophisticated lock manager process might be able to prevent deadlock, or
- at least detect it once underway, but this is usually too hard to do in
- general.
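-
- As an illustration of the kind of access protocol meant here, the Python
- sketch below takes all the locks a transaction needs in a growing phase and
- releases them only at the end. Two-phase locking by itself does not rule out
- deadlock, so the sketch adds one common convention as an assumption: locks are
- always acquired in a fixed sorted order. In-process threading locks stand in
- for file range locks, ranges are treated as opaque keys (overlapping ranges are
- not handled), and the names RangeLockTable and transaction are invented.
-
```python
import threading
from contextlib import contextmanager

class RangeLockTable:
    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}   # (start, length) -> lock for that range

    def _lock_for(self, rng):
        with self._guard:
            return self._locks.setdefault(rng, threading.Lock())

    @contextmanager
    def transaction(self, ranges):
        held = []
        try:
            # Growing phase: acquire every lock, always in the same global order.
            for rng in sorted(set(ranges)):
                lock = self._lock_for(rng)
                lock.acquire()
                held.append(lock)
            yield
        finally:
            # Shrinking phase: release only after all the work is done.
            for lock in reversed(held):
                lock.release()
```
-
- Two users who name the same ranges in opposite orders still acquire them in
- the same order, so neither can end up holding what the other is waiting for.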
-
- In summary then, it appears that the problem of persistent objects is in
- need of a general solution, or class of solutions. I believe guerilla action is
- called for.
- At the least, a simple subclass of TObject could be provided which would
- standardize object I/O and metadata resources. Perhaps a clipboard type could
- be provided as well. Beyond that, a virtual object capability could also be
- easily provided, preferably one that could resolve multiple references. This
- much alone would allow a persistent document type to be defined which could be
- opened by any application which defined the necessary classes. For classes
- which are not defined by the application, some methods could still be
- supported, at least enough to make copies and extract field data.
- The problems of multiple users and large collections are much harder. At
- least some abstract classes could be defined for interfacing to collections in
- databases, with access to particular databases provided by subclassing. Some
- simple concrete implementations would be nice, although they would probably not
- meet the performance needs of really big jobs (like CD-ROM). An abstract class
- approach might also be taken for multi-user access to virtual objects. In this
- way a framework could be provided for easy sharing of data among applications,
- with the most difficult, performance-critical, and application-specific issues
- left to the capitalists to pay for.
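-
- The abstract class approach to collections could be sketched as below, again
- in Python and again with invented names (PersistentCollection, DictCollection,
- check_in, check_out). The concrete class is only a toy: a dict of pickled
- records stands in for a real database, which a different subclass would wrap.
-
```python
import pickle
from abc import ABC, abstractmethod

class PersistentCollection(ABC):
    """Hypothetical abstract interface; a subclass binds it to one database."""

    @abstractmethod
    def check_in(self, key, obj):
        """Store obj under key."""

    @abstractmethod
    def check_out(self, key):
        """Return the object stored under key."""

    @abstractmethod
    def keys(self):
        """Return the keys in traversal order."""

    def each(self):
        # Shared by all subclasses: walk the collection, one object at a time.
        for key in self.keys():
            yield self.check_out(key)

class DictCollection(PersistentCollection):
    """Simple concrete implementation: pickled records held in a dict."""

    def __init__(self):
        self._records = {}

    def check_in(self, key, obj):
        self._records[key] = pickle.dumps(obj)

    def check_out(self, key):
        # Returns a fresh copy: callers get the object's value, not its identity.
        return pickle.loads(self._records[key])

    def keys(self):
        return sorted(self._records)
```
-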
- If you have read this far I take it you are interested, so where do you think
- we should go from here?
-
- Yours truly,
-
- Greg Colvin
-
-